- ===========================================================================
-
- The Hoard Multiprocessor Memory Allocator
- <http://www.hoard.org>
-
- by Emery Berger
- <http://www.cs.utexas.edu/users/emery>
-
- Copyright (c) 1998, 1999, 2000, The University of Texas at Austin.
-
- ---------------------------------------------------------------------------
- emery@cs.utexas.edu | <http://www.cs.utexas.edu/users/emery>
- Department of Computer Sciences | <http://www.cs.utexas.edu>
- University of Texas at Austin | <http://www.utexas.edu>
- ===========================================================================
-
-
- What's Hoard?
- -------------
-
- The Hoard memory allocator is a fast, scalable, and memory-efficient
- memory allocator for shared-memory multiprocessors.
-
- Why Hoard?
- ----------
-
- Multithreaded programs that perform dynamic memory allocation often
- do not scale because the heap is a bottleneck. When multiple threads
- simultaneously allocate or deallocate memory from the heap, they are
- serialized while waiting for the heap lock. Programs that make
- intensive use of the heap can actually slow down as the number of
- processors increases. (Note: if you use the STL heavily, you are
- also using the heap heavily, whether or not you realize it.)
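
To make the STL remark concrete, here is a minimal sketch (not part of Hoard) that counts how many heap allocations a small piece of STL code performs. It assumes a C++11 compiler. The counter g_allocations and the function count_stl_allocations are names I made up for illustration; replacing the global operator new is standard C++.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Hypothetical counter: incremented on every call to the global operator new.
static std::atomic<std::size_t> g_allocations{0};

// Replacement global operator new that counts calls and forwards to malloc.
void* operator new(std::size_t size) {
    g_allocations.fetch_add(1, std::memory_order_relaxed);
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

// Matching deallocation functions (plain and sized forms).
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Builds a vector of 1000 strings and returns how many times operator new
// was called while doing so.
std::size_t count_stl_allocations() {
    const std::size_t before = g_allocations.load();
    {
        std::vector<std::string> names;
        for (int i = 0; i < 1000; ++i) {
            // 64 characters should be too long for a small-string buffer,
            // so each string is expected to need its own heap block.
            names.push_back(std::string(64, 'x'));
        }
        assert(names.size() == 1000);
    }
    return g_allocations.load() - before;
}
```

Calling count_stl_allocations() from your own main and printing the result shows the heap traffic hidden behind a few lines of container code. Every one of those calls goes through the allocator, and so through the heap lock.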
-
- Hoard is a fast allocator that solves this problem. In addition, it
- has very reasonable bounds on memory consumption.
-
-
- How do I use it?
- ----------------
-
- Using Hoard is easy. It is written to work on any variant of UNIX that
- supports pthreads, and should compile out of the box using make. (See
- INSTALL for more details. Also, if you're using Windows or the BeOS,
- please read the appropriate NOTES file.)
-
- You can build Hoard in one of two ways (see INSTALL). Below, I assume
- you used the configure script.
-
- To link Hoard with the program foo (after doing "make install"):
-
- Linux:
- g++ foo.o -L/usr/local/lib -lhoard -lpthread -o foo
-
- Solaris:
- g++ foo.o -L/usr/local/lib -lhoard -lthread -lrt -o foo
-
- You *must* add "-lpthread" or "-lthread" to your list of libraries
- (except if you're using the sproc library on the SGI). Don't forget to
- add /usr/local/lib to your LD_LIBRARY_PATH environment variable.
-
- On UNIX, you may be able to use Hoard without relinking your
- application by setting the environment variable LD_PRELOAD, as in
- (csh syntax)
-
- setenv LD_PRELOAD "/lib/libpthread.so.0 /usr/local/lib/libhoard.so"
-
- This won't work for applications linked with the "-static" option,
- because statically linked executables do not go through the dynamic
- linker.
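
If you want to check from inside a program whether the preload took effect, one option on Linux is to look for the library in the process's own memory map. This is a sketch under assumptions: it relies on /proc/self/maps, which is Linux-specific, and library_is_mapped is a helper I made up. It is not part of Hoard.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Returns true if any line of /proc/self/maps contains `library`.
// Linux-specific. Returns false if the file cannot be opened (for example
// on systems without /proc/self/maps).
bool library_is_mapped(const std::string& library) {
    assert(!library.empty());  // an empty name would match every line
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line)) {
        if (line.find(library) != std::string::npos) return true;
    }
    return false;
}
```

For example, library_is_mapped("libhoard") should return true in a process started with the LD_PRELOAD setting above, and false for a statically linked one.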
-
-
- Did it work?
- ------------
-
- When you compile Hoard ("make"), you'll get six test programs in
- three pairs: testmymalloc(-hoard), threadtest(-hoard), and
- cache-scratch(-hoard). In each pair, the plain program uses the stock
- allocator and the -hoard program is linked with Hoard. The first pair
- measures raw uniprocessor speed. The second lets you observe
- scalability with multiple threads. The third tests the cache locality
- of your allocator (see cache-scratch.cpp for more details).
-
- ** NOTE: if you built with the configure script, the *-hoard
- ** programs are dynamically linked to the Hoard library. Static
- ** linking (using "make -f Makefile.orig") improves performance (at
- ** the cost of increasing the size of the executable).
-
- For instance,
-
- threadtest 2 1 800000
-
- will create two threads that will each allocate and free 400,000
- bytes. Compare this to
-
- threadtest-hoard 2 1 800000
-
- (the same program as above, but linked with Hoard).
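
If you'd like to run this kind of comparison on your own workload, the following is a rough sketch of a threaded allocate-and-free timer. It is a hypothetical stand-in, not the threadtest source: the function names, the 8-byte object size, and the way work is divided among threads are all my assumptions. It uses C++11 std::thread rather than raw pthreads.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Each worker repeatedly allocates a batch of small objects, touches them,
// and then frees the whole batch.
void alloc_free_worker(int iterations, std::size_t objects) {
    std::vector<char*> batch(objects);
    for (int it = 0; it < iterations; ++it) {
        for (std::size_t i = 0; i < objects; ++i) {
            batch[i] = new char[8];              // assumed object size
            batch[i][0] = static_cast<char>(i);  // touch the memory
        }
        for (std::size_t i = 0; i < objects; ++i) delete[] batch[i];
    }
}

// Starts `nthreads` workers, giving each an equal share of `total_objects`,
// and returns the elapsed wall-clock time in seconds.
double time_alloc_free(int nthreads, int iterations, std::size_t total_objects) {
    assert(nthreads > 0);
    const std::size_t per_thread = total_objects / nthreads;
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> workers;
    for (int t = 0; t < nthreads; ++t) {
        workers.emplace_back(alloc_free_worker, iterations, per_thread);
    }
    for (auto& w : workers) w.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Call time_alloc_free from your own main with 1, 2, ... threads, and build the program twice: once with the stock allocator and once linked with Hoard as described above. If the heap lock is the bottleneck, the stock build's time should stop improving (or get worse) as threads are added.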
-
- Likewise, try
-
- testmymalloc 100000 1
- and
- testmymalloc-hoard 100000 1
-
- to compare Hoard's uniprocessor performance with the stock allocator.
-
- For cache-scratch, try the following (on a P-processor machine):
-
- cache-scratch 1 1000 1 1000000
- cache-scratch P 1000 1 1000000
-
- cache-scratch-hoard 1 1000 1 1000000
- cache-scratch-hoard P 1000 1 1000000
-
- The ideal is a P-fold speedup.
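
To turn the measured run times into numbers you can compare against that ideal, a small helper like this is enough. The names speedup and efficiency are mine; the definitions are the usual ones (speedup is the one-thread time divided by the P-thread time, efficiency is speedup divided by P).

```cpp
#include <cassert>

// Speedup of a P-thread run relative to the one-thread run.
double speedup(double t1_seconds, double tp_seconds) {
    assert(t1_seconds > 0.0 && tp_seconds > 0.0);
    return t1_seconds / tp_seconds;
}

// Fraction of the ideal P-fold speedup actually achieved (1.0 is ideal).
double efficiency(double t1_seconds, double tp_seconds, int processors) {
    assert(processors > 0);
    return speedup(t1_seconds, tp_seconds) / processors;
}
```

For example, if the one-thread run takes 8 seconds and the run on 4 processors takes 2 seconds, speedup(8.0, 2.0) is 4.0 and efficiency(8.0, 2.0, 4) is 1.0, which is the ideal case.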
-
- Hoard has been successfully built on a 2-processor x86 running Windows
- NT SP4 with and without Cygwin, a 4-processor x86 box running Linux
- (Red Hat 6.0, kernel version 2.2.5-22 SMP), a 14-processor SPARC
- running Solaris 7, a 56-processor SGI Origin 2000 (ccNUMA
- architecture), and a 4-processor IBM F50 (PowerPC-based) under AIX.
-
-
- More information
- ----------------
-
- For more information on Hoard, along with some nice performance graphs, see
-
- Hoard: A Fast, Scalable and Memory-Efficient Allocator
- for Shared-Memory Multiprocessors
- September 1999
- University of Texas Dept. of Computer Sciences
- UTCS-TR99-22.
-
- (Included in this distribution in docs/UTCS-TR99-22.ps.gz)
-
- The latest version of Hoard will always be available from the Hoard web page:
-
- <http://www.hoard.org>
-
-
- Feedback
- --------
-
- Please send any bug reports and information about new platforms Hoard
- has been built on to emery@cs.utexas.edu.
-
-
- Mailing lists
- -------------
-
- There are two mailing lists for Hoard: hoard-announce, a low-volume
- mailing list for announcements of new releases of Hoard, and hoard,
- a mailing list for Hoard-related discussions.
-
- To subscribe, go to the Hoard home page (www.hoard.org) and enter your
- e-mail address in the appropriate box.
-
-
- Acknowledgements
- ----------------
-
- In addition to those thanked in the paper, I'd like to thank Ganesan
- Rajagopal for submitting the autoconf and automake scripts, John
- Hickin and Paul Larson for improving the NT port, and Trey Boudreau
- for the BeOS port. Thanks also to Kevin Mills, Robert Fleischman,
- and Martin Bachtold.
-
-
- --
- Emery Berger | Parallel Programming
- emery@cs.utexas.edu | & Multiprogramming MP Groups
- <http://www.cs.utexas.edu/users/emery> | University of Texas at Austin
-
-